The hyperthermophilic glycoside hydrolase family 12 endocellulase from the 
archaeon Pyrococcus furiosus (EGPf; Gene ID PF0854; EC 3.2.1.4) catalyzes 
the hydrolytic cleavage of the β-1,4-glucosidic linkage in β-glucan in 
lignocellulose biomass. A crystal of EGPf was previously prepared at pH 9.0 
and its structure was determined at an atomic resolution of 1.07 Å. This article 
reports the crystallization of EGPf at the more physiologically relevant pH of 
5.5. Structure determination showed that this new crystal form has the symmetry 
of space group C2. Two molecules of the enzyme are observed in the asymmetric 
unit. Crystal packing is weak at pH 5.5 owing to two flexible interfaces between 
symmetry-related molecules. Comparison of the EGPf structures obtained at pH 
9.0 and pH 5.5 reveals a significant conformational difference at the active 
centre and in the surface loops. The interfaces in the vicinity of the flexible 
surface loops impact the quality of the EGPf crystal. 



PDB reference: hyperthermophilic endocellulase, 3wq7 



1. Introduction 




Cellulases are the most important industrial enzymes for biomass 
utilization, since they play a key role in the degradation of 
β-glucan cellulose. Recent research into biofuel production from 
lignocellulose biomass has accelerated the development of cellulases 
optimized for efficient biomass breakdown to monosaccharides, 
known as saccharification. A hyperthermophilic cellulase would be 
very useful in industrial applications because enzymatic reactions 
occurring at high temperature have many merits, such as a reduced 
risk of microbial contamination, increased solubility of the substrate 
and improved transfer rate. Therefore, much research has focused on 
developing thermophilic cellulases with high activity. 

Hyperthermophilic β-1,4-endocellulases (endo-type cellulases) 
have been found in the genome databases of several hyperthermophilic 
archaea. The hyperthermophilic archaea Pyrococcus horikoshii 
and P. furiosus have glycoside hydrolase (GH) family 5 (EGPh) and 
family 12 endocellulases (EGPf), respectively. Each family of 
enzymes shows different substrate specificities and exhibits hydrolytic 
activity at high temperatures. The optimal and denaturing temperatures 
of EGPh are 100 and 103°C, respectively (Kim & Ishikawa, 
2013), and those of EGPf are 100 and 112°C, respectively (Bauer et 
al., 1999). The crystal structures of two hyperthermophilic 
endocellulases have been determined (Kim & Ishikawa, 2010; Kim et al., 
2012). The structure of EGPf was determined at an atomic resolution 
of 1.07 Å (Kim et al., 2012; PDB entry 3vgi). Under the crystallization 
conditions used by Kim and coworkers, the EGPf crystal form has 
symmetry consistent with space group P2₁2₁2 and one EGPf molecule 
is present per asymmetric unit. The optimum pH of EGPf is reported 
to be approximately pH 6.0 (Bauer et al., 1999), but the crystal was 
prepared at pH 9.0 (Kataoka et al., 2012). Here, we describe a new 
crystal form of EGPf prepared at pH 5.5 and compare it with the 
previously solved EGPf structure. 

2. Materials and methods 

2.1. Protein preparation 

We prepared recombinant EGPf using a method similar to that 
described previously (Kim et al., 2012). The plasmid carrying 
full-length EGPf was built on the vector pET-11a (Novagen, Madison, 
Wisconsin, USA) and was introduced into Escherichia coli strain 
BL21 (DE3) pLysS. The truncated protein gene (EGPfΔN30), with a 
deletion of 30 amino-acid residues (signal sequence and the proline 
and hydroxyl residue-rich regions) at the N-terminal region of EGPf, 
was constructed by PCR and inserted into the pET-11a vector. The 
pET-11a construct was introduced into E. coli BL21 (DE3) cells for 
recombinant protein expression. The cells were cultured in LB containing 
100 mg l⁻¹ sodium ampicillin at 37°C to an OD600 of 0.6, and 
isopropyl β-D-1-thiogalactopyranoside (IPTG) was then added. After 
16 h culture at 30°C, the cells were harvested by centrifugation 
(5000g, 15 min, 4°C). The cells were suspended in 20 mM Tris-HCl 
pH 8.0 containing 0.7 M ammonium sulfate and then homogenized by 
ultrasonication (40 W, 20 kHz) for 15 min on ice. The homogenates 
were heated at 70°C for 15 min. 

After removing the cell debris by centrifugation (15 000g, 20 min, 
4°C), the enzyme solution was filtered (0.2 µm) and applied to 
hydrophobic interaction column chromatography (using a HiTrap 
Phenyl column). The flow rate for the column chromatography was 
2.0 ml min⁻¹ using 20 mM Tris-HCl pH 8.0 buffer with a gradient of 
0.7-0 M ammonium sulfate. Gel-filtration chromatography (using a 
HiLoad 26/60 Superdex 200 pg column) was carried out using 20 mM 
Tris-HCl pH 8.0 buffer with a flow rate of 2.0 ml min⁻¹. The purity 
and molecular weight of the protein were analyzed by SDS-PAGE. 
The concentration of EGPfΔN30 (molecular weight 30 540.48 Da) 
was determined from the UV absorbance at 280 nm, using 81 790 M⁻¹ cm⁻¹ 
as the molar extinction coefficient, which was calculated from the 
protein sequence (UniProt ID E7FHY8; Gill & von Hippel, 1989). 
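
The concentration estimate described above follows the Beer-Lambert law, c = A280 / (ε × l). The short Python sketch below illustrates that arithmetic with the extinction coefficient and molecular weight quoted in the text. The absorbance reading, the 1 cm path length and the 50-fold dilution are hypothetical values chosen only for illustration; they are not taken from the experiment.

```python
# Minimal sketch of a Beer-Lambert concentration estimate from A280.
# The extinction coefficient and molecular weight come from the text;
# the absorbance, path length and dilution factor are assumed values.

EXT_COEFF = 81790.0      # molar extinction coefficient (M^-1 cm^-1)
MOL_WEIGHT = 30540.48    # molecular weight of the truncated enzyme (Da = g/mol)


def concentration_mg_per_ml(a280, ext_coeff, mol_weight, path_cm=1.0, dilution=1.0):
    """Return the protein concentration in mg/ml for a measured A280.

    a280      absorbance of the (possibly diluted) sample at 280 nm
    dilution  factor by which the sample was diluted before measurement
    """
    if ext_coeff <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 * dilution / (ext_coeff * path_cm)  # mol/l
    return molar * mol_weight                        # g/l, i.e. mg/ml


# Hypothetical reading: A280 = 0.91 after a 50-fold dilution, 1 cm cuvette.
conc = concentration_mg_per_ml(0.91, EXT_COEFF, MOL_WEIGHT, dilution=50)
print(f"estimated concentration: {conc:.2f} mg/ml")
```

A concentrated sample must be diluted into the linear range of the spectrophotometer before reading, which is why the dilution factor appears explicitly.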

2.2. Crystallization 

The purified EGPfΔN30 was concentrated to 17 mg ml⁻¹ and then 
dialyzed against 20 mM Tris-HCl pH 8.0 using an Amicon Centricon 
YM-10 (Millipore, Billerica, Massachusetts, USA) by centrifugation 
(5000g, 4°C). The crystals of EGPfΔN30 were grown at 22°C using a 
reservoir solution composed of 100 mM CHC (2:3:4 citric acid: 




Figure 1 

A photograph of the EGPfΔN30 crystals prepared at pH 5.5. The scale bar 
corresponds to 0.5 mm. 



Table 1 

Data-collection and refinement statistics for the structure of EGPfΔN30 at pH 5.5. 

Data collection 
  Wavelength (Å)                                    0.9 
  Space group                                       C2 
  Unit-cell parameters (Å, °)                       a = 134.7, b = 62.6, c = 86.3, β = 95.1 
  Molecules per asymmetric unit                     2 
  Matthews coefficient (Å³ Da⁻¹) (Matthews, 1968)   2.5 
  Solvent content (%)                               51 
  Resolution range (Å)                              50.0-1.68 (1.71-1.68) 
  Total No. of observed reflections                 311544 (~15000) 
  No. of unique reflections                         81623 (4029) 
  Average I/σ(I)                                    14.9 (3.7) 
  Rmerge†                                           0.068 (0.367) 
  Multiplicity                                      3.8 (3.8) 
  Completeness (%)                                  99.9 (99.9) 
Refinement 
  No. of atoms 
    Protein                                         4390 
    Glycerol                                        102 
    Ca²⁺                                            4 
    Water                                           468 
  Resolution used in refinement (Å)                 43.0-1.68 
  Rwork‡/Rfree§                                     0.181/0.217 
  Wilson B factor (Å²)                              18 
  R.m.s.d., bond distances¶ (Å)                     0.03 
  R.m.s.d., bond angles¶ (°)                        2.5 
  Mean overall B factor (Å²)                        27 
Ramachandran plot 
  Most favoured regions (%)                         96.6 
  Disallowed regions (%)                            0.0 
PDB code                                            3wq7 

† $R_{\mathrm{merge}} = \sum_{hkl}\sum_{i} |I_i(hkl) - \langle I(hkl)\rangle| / \sum_{hkl}\sum_{i} I_i(hkl)$, where $I_i(hkl)$ is the intensity of 
the ith measurement of reflection hkl, including symmetry-related reflections, and 
$\langle I(hkl)\rangle$ is their average. ‡ $R_{\mathrm{work}} = \sum_{hkl} \big||F_{\mathrm{obs}}| - |F_{\mathrm{calc}}|\big| / \sum_{hkl} |F_{\mathrm{obs}}|$. § Rfree is Rwork 
for approximately 5% of the reflections that were excluded from the refinement. 
¶ R.m.s.d. bond distances and angles are r.m.s.d.s from ideal values (Engh & 
Huber, 1991). 



HEPES:CHES) buffer pH 5.5, 200 mM lithium sulfate, 5%(v/v) 
ethanol by the hanging-drop vapour-diffusion method. Typically, 
drops consisting of 1 µl protein solution and 1 µl reservoir solution 
were equilibrated against 450 µl reservoir solution. 

2.3. Data collection and processing 

The selected crystals were harvested and immersed in cryoprotectant 
solution consisting of 30%(v/v) glycerol in mother liquor. The 
soaked crystal was mounted in a Cryo-Loop (Hampton Research, 
Aliso Viejo, California, USA) and immediately flash-cooled under a 
stream of nitrogen gas at −173°C. X-ray diffraction data from a single 
crystal were collected using an MX-300HE CCD 
detector (Rayonix, Evanston, Illinois, USA) on the SPring-8 
BL44XU beamline (Hyogo, Japan). The diffraction data set extended 
to 1.68 Å resolution and was collected at a wavelength of 0.9 Å. The 
crystal-to-detector distance was 220 mm. The crystal was rotated 180° 
with an oscillation angle of 0.5° per frame. The diffraction data 
were indexed, integrated, scaled and merged 
using the programs in the HKL-2000 software package 
(Otwinowski & Minor, 1997). Data-collection statistics are presented 
in Table 1. 
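
The merging statistic Rmerge reported in Table 1 compares each repeated measurement of a reflection with the mean of its group. The following Python sketch shows the arithmetic on a tiny set of made-up intensities. It is only an illustration of the definition, not the procedure used inside HKL-2000; the function name `r_merge` and the toy data are hypothetical, and the observations are assumed to have already been reduced to unique reflection indices.

```python
# Minimal sketch of the Rmerge definition:
#   Rmerge = sum_hkl sum_i |I_i(hkl) - <I(hkl)>| / sum_hkl sum_i I_i(hkl)
# The intensities below are invented for illustration only.
from collections import defaultdict


def r_merge(observations):
    """Compute Rmerge from (hkl, intensity) pairs.

    Each hkl is assumed to be the unique (symmetry-reduced) index, so that
    symmetry-related measurements share the same key.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)

    if denominator == 0:
        raise ValueError("sum of intensities is zero")
    return numerator / denominator


toy_data = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
    ((0, 2, 0), 50.0), ((0, 2, 0), 54.0), ((0, 2, 0), 46.0),
]
print(f"Rmerge = {r_merge(toy_data):.3f}")
```

Because every group is compared with its own mean, the value depends on multiplicity as well as on data quality, so it is usually read alongside the multiplicity and I/σ(I) columns.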

2.4. Structure determination and refinement 

The EGPfΔN30 structure was determined by molecular replacement 
with MOLREP (Vagin & Teplyakov, 2010) in the CCP4 
package (Winn et al., 2011), using the structure of EGPfΔN30 in the 
P2₁2₁2 form as a search model (Kim et al., 2012; PDB entry 3vgi). 
Structure model building was performed with Coot (Emsley et al., 
2010). The structure was refined using REFMAC5 (Murshudov et al., 
2011). Water molecules were introduced at peaks over 3 r.m.s.d. in the 
Fo − Fc difference Fourier map fulfilling reasonable interactions with 
the protein model. The final structure was validated with a 
Ramachandran plot using PROCHECK (Laskowski et al., 1993). The values of 
the r.m.s.d. for comparisons of structures were calculated using 
SUPERPOSE (Krissinel & Henrick, 2004). Figures were prepared 
using PyMOL (DeLano, 2002). The refinement statistics are 
presented in Table 1. The interfaces between EGPfΔN30 and 
symmetry-related molecules were calculated using PISA (Protein Interfaces, 
Surfaces and Assemblies; Krissinel & Henrick, 2007). 
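
As an illustration of the r.m.s.d. values used for the structure comparisons, the Python sketch below computes the root-mean-square deviation between two lists of paired atom coordinates. It assumes the two structures have already been superposed and that atoms are matched one to one; the fitting step that a program such as SUPERPOSE performs is not reproduced here. The function name `rmsd` and the toy coordinates are hypothetical.

```python
# Minimal sketch: r.m.s.d. between paired, already-superposed coordinates.
# The coordinates are invented; no structural alignment is performed here.
import math


def rmsd(coords_a, coords_b):
    """Return the r.m.s.d. (same units as the input) of paired 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Three hypothetical atom pairs, each displaced by 0.3 A along one axis.
model_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
model_2 = [(0.0, 0.0, 0.3), (1.0, 0.3, 0.0), (0.3, 1.0, 0.0)]
print(f"r.m.s.d. = {rmsd(model_1, model_2):.2f} A")
```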



3. Results and discussion 

3.1. Structure of EGPf at pH 5.5 

EGPf appears to be a secretory enzyme because of its signal 
sequence at the N-terminus. Recombinant EGPf without the signal 
sequence was expressed using the pET system. No recombinant 
enzyme crystals were obtained using the Crystal Screen (Hampton 
Research, Aliso Viejo, California, USA) or Wizard I and II (Emerald 
Bio, Bainbridge Island, Washington, USA) crystallization screening 
kits. However, the recombinant product of a truncated protein gene 




Figure 2 

Comparison of the EGPfΔN30 structures at pH 5.5 and at pH 9.0, identified by colour: red, pH 5.5; blue, pH 9.0. (a) Wall-eyed stereoview of the overall crystal structure of 
EGPfΔN30 drawn as a ribbon model viewed from the front. The two EGPfΔN30 structures are superimposed on each other. (b) The structure of the entrance to the active-site 
cleft is changed between ApH5.5 and ApH9.0, as are the structures of the catalytic residues. (c) The r.m.s.d. values of the Cα atoms of ApH5.5/BpH5.5 and ApH5.5/ApH9.0. (d) B 
factors of the amino-acid residues of EGPfΔN30 at pH 5.5 (ApH5.5 and BpH5.5) and pH 9.0 (ApH9.0). Hydrogen bonds between the two molecules are indicated by a dotted red 
line. (e) Catalytic mechanism of EGPf in the first half-reaction. Here, the typical mechanism of a retaining enzyme is depicted in a schematic diagram. The two glucose residues 
correspond to the productive binding mode. pH 9.0: the acidic side chain of Glu178 adjacent to the nucleophile Glu197 maintains a negative charge. pH 5.5: the nucleophile 
Glu197 maintains a negative charge without the side chain of Glu178. 
(EGPfΔN30), in which the N-terminal 30 amino-acid residues (signal 
sequence and the proline and hydroxyl residue-rich regions) were 
deleted from EGPf, was crystallized at pH 9.0, and the structure was 
determined at 1.07 Å resolution (Kim et al., 2012; PDB entry 3vgi). 
Under these crystallization conditions, crystals grew with symmetry 
consistent with space group P2₁2₁2, and one EGPf molecule is 
present per asymmetric unit. However, the enzymatic optimum pH of 
EGPf was reported to be approximately pH 6.0 (Bauer et al., 1999). 
Therefore, we attempted to prepare crystals at the more physiologically 
relevant pH of 5.0-6.0 in order to obtain the structure of the 
active site in EGPf under these conditions. Based on initial screening 
results, high-quality crystals of EGPfΔN30 were obtained using a 
reservoir solution consisting of 100 mM CHC (2:3:4 citric acid: 
HEPES:CHES) buffer pH 5.5, 200 mM lithium sulfate, 5%(v/v) 
ethanol at 22°C. The average size of each crystal was about 0.7 × 0.5 
× 0.3 mm after one week (Fig. 1). This was a new crystal form with 
symmetry consistent with space group C2 that diffracted to 1.68 Å 
resolution. The data-collection statistics are summarized in Table 1. 
Determination of the structure of EGPfΔN30 was carried out by the 
molecular-replacement method using the previous structural data 
(Kim et al., 2012; PDB entry 3vgi). Two molecules of EGPfΔN30, 
labelled ApH5.5 and BpH5.5, were identified in the crystallographic 
asymmetric unit. The final model contains two monomers 
of 270 amino-acid residues each. After refinement, the R factors 
were Rwork = 0.181 and Rfree = 0.217. The structure of 
the enzyme consists of a β-jelly-roll fold (Fig. 2a). 

3.2. Comparison to the previously determined EGPf structure 

The r.m.s.d. of the Cα atoms between ApH5.5 and BpH5.5 was 0.3 Å 
(Fig. 2b). In both molecules, the DxDxDG calcium-binding motif 
(Asp68-Glu76 and Asp142) was present in the loop region between 
the B1 and B2 strands and exhibited high B-factor values (Fig. 2c). In 
BpH5.5, poor electron density was observed for the loop regions of 
B3-A5 (Gly131-Asp156), B5-B6 (Thr182-Asp194) and α-helix-B4 
(Ser272-Glu283), but the regions were interpretable. On the other 
hand, EGPf crystallized at pH 9.0 has symmetry consistent with 
P2₁2₁2, with unit-cell parameters a = 58.0, b = 118.7, c = 46.8 Å (Kim 
et al., 2012; PDB entry 3vgi). Under the previous crystallization 
conditions, one molecule exists in the asymmetric unit (labelled 
ApH9.0). The r.m.s.d. of the Cα atoms between structures ApH5.5 and 
ApH9.0 was 0.2 Å (Fig. 2b). Comparison of the structures at pH 5.5 and 
9.0 showed that the main-chain structures are the same. 
However, conformational changes were observed in the side chains of 
Trp62 and Glu178 (Fig. 2a) located at the active-site cleft. Trp62 
seems to contribute to substrate binding (Kim et al., 2012; PDB 
entry 3vgi). The torsion angle (χ1) of the indole ring of Trp62 differs 
by approximately 20° between ApH5.5 and ApH9.0. Trp62 is not 




Table 2 

Interfaces between monomers of the determined molecule and the symmetric molecules. 

In the interface, single primes (') refer to interactions with central molecule (ApH5.5 or ApH9.0) and double primes ('') refer to interactions with central molecule (BpH5.5). In the interaction 
molecule, single primes (') of the interaction molecules refer to molecules that are in the first layer relative to a central molecule (ApH5.5 or ApH9.0) and double primes ('') refer to 
molecules that are in a second layer relative to that central molecule (BpH5.5). 

Columns: interface; interacting molecule; symmetry operation; interface area (Å²); solvation free energy gain (ΔiG)† (kcal mol⁻¹); hydrogen bonds; salt bridges; B factor‡ (Å²). 

Interface  Molecule  Symmetry operation          Area   ΔiG†   H bonds  Salt bridges  B factor‡ 

ApH5.5 
1          A'1       −x + 1/2, y − 1/2, −z       410    −0.5   6        0             16/16 
2          A'2       −x, y, −z                   360    −6.1   4        0             17/17 
3          B'1       −x + 1/2, y − 1/2, −z       200    −1.9   1        0             27/25 

BpH5.5 
1'         B''1      −x + 1/2, y − 1/2, −z − 1   410    −2.1   6        0             38/52 
2'         ApH5.5    x, y, z                     380    −2.0   6        0             17/15 
3'         A'1       −x + 1/2, y − 1/2, −z       310    1.8    3        3             19/18 
4'         A''2      x, y, z − 1                 220    −0.8   3        0             42/27 

ApH9.0 
1''        A'1       x − 1/2, −y + 1/2, −z + 1   500    −2.3   6        0             12/13 
2''        A'2       −x + 1, −y, z               480    −1.3   14       2             12/12 
3''        A'3       x − 1/2, −y + 1/2, −z       220    −0.5   1        0             14/13 
4''        A'4       x, y, z − 1                 200    −3.5   0        0             13/16 

† The sum values of the gain on complex formation for the two surfaces. ‡ Value of B factor at the interface belonging to each monomer. 




Figure 3 

Symmetry-related molecules and interfaces of ApH5.5 and ApH9.0. The single primes 
(') refer to molecules that are in the first layer relative to ApH5.5 or ApH9.0 and the 
double primes ('') refer to molecules that are in a second layer relative to BpH5.5. (a) 
Seven symmetry-related molecules, drawn as cartoon models, are viewed from the 
front and upper side. Characters in the molecules correspond to the interaction 
molecules in Table 2. (b) Interfaces 1' and 4' are drawn as tube models. Rainbow 
colours are used to show the high (red) and low (blue) B factors of the amino-acid 
residues. 



particularly mobile as the B factor of Trp62 is ~20 Å². Trp62 is 
located at subsite −4, at the entrance to the nonreducing side of the 
active-site cleft. On the other hand, the dihedral angle of the carboxyl 
group of Glu178 differs between ApH5.5 and ApH9.0 by approximately 80°. 
Because of this, the distance between Glu178 OE1 and Glu197 OE2 
in ApH5.5 (3.2 Å) is larger than that in ApH9.0 (2.5 Å) by 0.7 Å. Glu178 
is located to the back of two catalytic residues (Glu197, nucleophile; 
and Glu290, proton donor) and it is thought to be the proton donor to 
Glu197 (cellulase 12A from Thermotoga maritima; Cheng et al., 
2011). Although the B-factor values for Glu197 of ApH5.5 and BpH5.5 
are 16 and 35 Å², respectively, the conformation of Glu178 did not 
change between the apo forms (ApH5.5 and BpH5.5). No other significant 
differences at the active centre were observed among the ApH9.0, 
ApH5.5 and BpH5.5 molecules. This result suggests that the conformational 
change of Glu178 is due to protonation of the carboxyl group 
of Glu178 and/or Glu197 at acidic pH. The position or state of 
nucleophile Glu197 is stabilized by Glu178 at pH 9.0. Glu178 seems 
to play a role in controlling the optimum pH of the enzymatic activity. 
From the catalytic mechanism of cellulase 12A from T. maritima 
(Cheng et al., 2011), Glu197 is identified as the catalytic nucleophile 
of EGPf (Fig. 2e). In the first half-reaction, the acidic side chain of 
Glu178 adjacent to the nucleophile Glu197 is believed to maintain a 
negative charge (Fig. 2e), as suggested by Cheng et al. (2011). This 
catalytic mechanism was supported by the structural data at pH 9.0 
(Fig. 2b). However, the structural data at pH 5.5 (Fig. 2b) suggest 
another catalytic mechanism (Fig. 2e). It is speculated that the side 
chain of Glu178 located at a distance of 3.2-3.7 Å from the nucleophile 
Glu197 is protonated and the nucleophile Glu197 maintains a 
negative charge in the first half-reaction. 
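
The protonation argument above can be made more concrete with the Henderson-Hasselbalch relation, which gives the protonated fraction of a carboxyl group as 1 / (1 + 10^(pH − pKa)). The Python sketch below evaluates it at the two crystallization pH values. The pKa values are assumptions made for illustration only: about 4.3 is a commonly quoted value for an unperturbed glutamate side chain, and 6.0 is a hypothetical elevated value of the kind that can occur for buried or paired carboxylates. Neither was measured for Glu178 or Glu197 in this work.

```python
# Minimal sketch: protonated fraction of a carboxyl group from
# the Henderson-Hasselbalch relation. The pKa values are assumed,
# not measured for any residue of EGPf.


def fraction_protonated(ph, pka):
    """Return the fraction of the acid present in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


ASSUMED_PKAS = {
    "unperturbed Glu side chain (assumed pKa 4.3)": 4.3,
    "hypothetical elevated pKa 6.0": 6.0,
}

for label, pka in ASSUMED_PKAS.items():
    for ph in (5.5, 9.0):
        frac = fraction_protonated(ph, pka)
        print(f"{label}, pH {ph}: {100 * frac:.2f}% protonated")
```

Under either assumption the carboxyl group is essentially fully deprotonated at pH 9.0, whereas at pH 5.5 a substantial protonated population appears only if the pKa is shifted upwards, which is consistent with the speculative nature of the proposal in the text.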

3.3. Interfaces with symmetry-related molecules 

The EGPfΔN30 structures at pH 5.5 and at pH 9.0 each interact 
with seven symmetry-related molecules (Fig. 3a, Table 2): single 
primes (') for the interacting molecules refer to molecules that are in 
the first layer relative to a central molecule (ApH5.5 or ApH9.0) and 
double primes ('') refer to molecules that are in a second layer 
relative to the central molecule (BpH5.5). That is, seven interfaces 
(1-3, 1'-4') at pH 5.5 and four interfaces (1''-4'') at pH 9.0 were formed 
(Table 2). The interfaces between monomers of the central molecule 
and the symmetry-related molecules are summarized in Table 2. The 
B-factor values of the amino-acid residues of the two determined 
structures are shown in Fig. 2(c). The average values for ApH5.5, BpH5.5 
and ApH9.0 are 18, 33 and 12 Å², respectively. In BpH5.5, the B factors 
overall and in the surface loop regions are higher than for ApH5.5 
and ApH9.0. In particular, the B factors of the DxDxDG calcium-binding 
motif and the B5-B6 loop region in ApH5.5/BpH5.5 exhibit 
higher values (44/54 and 19/51 Å²) than in ApH9.0 (14 and 11 Å²) 
(Figs. 2c and 3b). In BpH5.5, two interactions, 1' (BpH5.5-B''1) and 4' 
(BpH5.5-A''2), in these regions have higher B-factor values than the other 
five interactions (1, 2, 3, 2' and 3') (Table 2). These interactions are 
likely to weaken the molecular packing because of the high fluctuation 
and flexibility of these regions. In contrast to the EGPfΔN30 
structures at pH 5.5, ApH9.0 has stronger packing because of the lower 
flexibility of the interfaces. In conclusion, crystal packing is weaker 
and the quality of the EGPfΔN30 crystal at pH 5.5 is lower than at 
pH 9.0 because of the flexible interfaces 1' and 4'. 
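
To give a feel for what the average B factors quoted above mean in terms of atomic motion, the Python sketch below converts an isotropic B factor into a root-mean-square displacement using the standard relation B = 8π²⟨u²⟩. This is a back-of-the-envelope illustration rather than part of the original analysis, and it treats the B factor as purely isotropic and ignores contributions from static disorder and model error.

```python
# Minimal sketch: isotropic B factor to r.m.s. atomic displacement,
# using B = 8 * pi^2 * <u^2>. Input values are the average B factors
# quoted in the text; the interpretation is deliberately simplified.
import math


def rms_displacement(b_factor):
    """Return sqrt(<u^2>) in angstroms for an isotropic B factor in A^2."""
    if b_factor < 0:
        raise ValueError("B factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


AVERAGE_B = {"ApH5.5": 18.0, "BpH5.5": 33.0, "ApH9.0": 12.0}

for molecule, b in AVERAGE_B.items():
    print(f"{molecule}: B = {b:.0f} A^2, r.m.s. displacement = {rms_displacement(b):.2f} A")
```

Because the displacement scales with the square root of B, the nearly threefold difference in average B factor between BpH5.5 and ApH9.0 corresponds to a much smaller relative difference in displacement amplitude.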



The X-ray diffraction data were obtained on beamline BL44XU of 
SPring-8, Hyogo, Japan with the approval of the Institute for Protein 
Research, Osaka University, Osaka, Japan (proposal No. 2013B6803). 